
    Sparse Exploratory Factor Analysis

    Sparse principal component analysis has been a very active research area over the last decade. It produces component loadings with many zero entries, which facilitates their interpretation and helps avoid redundant variables. Classic factor analysis is another popular dimension reduction technique that shares similar interpretation problems and could greatly benefit from sparse solutions. Unfortunately, there are very few works considering sparse versions of classic factor analysis. Our goal is to contribute further in this direction. We revisit the most popular procedures for exploratory factor analysis, maximum likelihood and least squares. Sparse factor loadings are obtained for them by, first, adopting a special reparameterization and, second, introducing additional ℓ1-norm penalties into the standard factor analysis problems. As a result, we propose sparse versions of the major factor analysis procedures. We illustrate the developed algorithms on well-known psychometric problems. Our sparse solutions are critically compared to those obtained by other existing methods.
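
    To make the idea concrete, here is a minimal numpy sketch of the least-squares flavour of this approach (not the authors' algorithm): fit R ≈ L L' + diag(psi) by proximal gradient on the loadings, where the soft-thresholding step plays the role of the ℓ1-norm penalty. The penalty weight, step size and iteration count are arbitrary placeholders.

        import numpy as np

        def sparse_ls_factor_analysis(R, k, lam=0.1, step=0.01, iters=2000):
            """Toy l1-penalised least-squares FA: R ~ L @ L.T + diag(psi)."""
            p = R.shape[0]
            rng = np.random.default_rng(0)
            L = 0.1 * rng.standard_normal((p, k))       # factor loadings
            psi = np.ones(p)                            # unique variances
            for _ in range(iters):
                E = R - L @ L.T - np.diag(psi)          # symmetric residual
                L = L + step * 4 * E @ L                # gradient step on the loadings
                L = np.sign(L) * np.maximum(np.abs(L) - step * lam, 0.0)  # soft-threshold (l1 prox)
                psi = np.clip(np.diag(R - L @ L.T), 1e-6, None)           # closed-form diagonal update
            return L, psi

    Run on a sample correlation matrix, this returns loadings whose small entries have been driven exactly to zero, which is the interpretational gain the paper targets.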

    Archetypal Analysis: Mining Weather and Climate Extremes

    Conventional analysis methods in weather and climate science (e.g., EOF analysis) exhibit a number of drawbacks, including scaling and mixing. These methods focus mostly on the bulk of the probability distribution of the system in state space and overlook its tail. This paper explores a different method, archetypal analysis (AA), which focuses precisely on the extremes. AA seeks to approximate the convex hull of the data in state space by finding “corners” that represent “pure” types, or archetypes, through computing mixture weight matrices. The method is quite new in climate science, although it has been around for about two decades in pattern recognition. It encompasses, in particular, the virtues of EOFs and clustering. The method is presented along with a new manifold-based optimization algorithm that optimizes for the weights simultaneously, unlike the conventional multistep algorithm based on alternating constrained least squares. The paper discusses the numerical solution and then applies it to the monthly sea surface temperature (SST) from HadISST and to the Asian summer monsoon (ASM) using sea level pressure (SLP) from ERA-40 over the Asian monsoon region. The application to SST reveals, in particular, three archetypes, namely El Niño, La Niña, and a third pattern representing the western boundary currents. The latter archetype shows a particular trend over the last few decades. The application to the ASM SLP anomalies yields archetypes that are consistent with the ASM regimes found in the literature. Merits and weaknesses of the method, along with possible future developments, are also discussed.
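
    For readers new to AA, the following is a rough, self-contained sketch of the plain model (not the paper's manifold-based algorithm): minimise ||X − ABX||² with the rows of the weight matrices A and B constrained to the probability simplex, here handled by a row-wise softmax reparameterisation and a generic quasi-Newton solver; the archetypes are the rows of BX. The solver choice, initialisation and reliance on numerical gradients are my simplifications.

        import numpy as np
        from scipy.optimize import minimize

        def archetypal_analysis(X, k, seed=0):
            """Toy archetypal analysis: X ~ A @ (B @ X), rows of A and B on the simplex."""
            n, _ = X.shape
            rng = np.random.default_rng(seed)

            def softmax_rows(M):
                M = M - M.max(axis=1, keepdims=True)
                E = np.exp(M)
                return E / E.sum(axis=1, keepdims=True)

            def unpack(theta):
                A = softmax_rows(theta[:n * k].reshape(n, k))   # mixture weights (n x k)
                B = softmax_rows(theta[n * k:].reshape(k, n))   # archetype weights (k x n)
                return A, B

            def loss(theta):
                A, B = unpack(theta)
                return np.linalg.norm(X - A @ B @ X) ** 2       # reconstruction error

            theta0 = 0.01 * rng.standard_normal(2 * n * k)
            A, B = unpack(minimize(loss, theta0, method="L-BFGS-B").x)
            return B @ X, A                                     # archetypes and weights

    This is only practical for small data sets; real analyses would use the alternating constrained least squares or the manifold-based algorithm described above.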

    Classification in sparse, high dimensional environments applied to distributed systems failure prediction

    Network failures are still one of the main causes of distributed systems’ lack of reliability. To overcome this problem, we present an improvement over a failure prediction system, based on Elastic Net Logistic Regression and the application of rare-event prediction techniques, able to work with sparse, high-dimensional datasets. Specifically, we prove its stability, fine-tune its hyperparameter and improve its industrial utility by showing that, with a slight change in dataset creation, it can also predict the location of a failure, a key asset when trying to take a proactive approach to failure management.
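
    The core classifier described here can be sketched in a few lines of scikit-learn: elastic-net-penalised logistic regression, with class re-weighting as a simple stand-in for the rare-event techniques mentioned above. The synthetic data, split and hyperparameter values below are placeholders, not the paper's tuned setup.

        from sklearn.datasets import make_classification
        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import train_test_split
        from sklearn.metrics import classification_report

        # stand-in for the sparse, high-dimensional failure dataset (1 = failure, rare)
        X, y = make_classification(n_samples=2000, n_features=500, n_informative=20,
                                   weights=[0.97, 0.03], random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                                  stratify=y, random_state=0)

        clf = LogisticRegression(penalty="elasticnet", solver="saga", l1_ratio=0.5,
                                 C=1.0, class_weight="balanced",  # crude rare-event adjustment
                                 max_iter=5000)
        clf.fit(X_tr, y_tr)
        print(classification_report(y_te, clf.predict(X_te)))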

    Semi-sparse PCA

    It is well known that classical exploratory factor analysis (EFA) of data with more observations than variables has several types of indeterminacy. We study the factor indeterminacy and show some new aspects of this problem by considering EFA as a specific data matrix decomposition. We adopt a new approach to EFA estimation and achieve a new characterization of the factor indeterminacy problem. A new alternative model is proposed, which gives determinate factors and can be seen as a semi-sparse principal component analysis (PCA). An alternating algorithm is developed, in which each step solves a Procrustes problem. It is demonstrated that the new model/algorithm can act as a specific sparse PCA and as a low-rank-plus-sparse matrix decomposition. Numerical examples with several large data sets illustrate the versatility of the new model, and the performance and behaviour of its algorithmic implementation.
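
    The building block of the alternating algorithm mentioned above is the orthogonal Procrustes problem; the sketch below shows that generic step in isolation (find the orthogonal Q minimising ||A − BQ||_F via one SVD), not the full semi-sparse PCA procedure.

        import numpy as np

        def procrustes_rotation(A, B):
            """Orthogonal Procrustes: argmin_Q ||A - B @ Q||_F subject to Q.T @ Q = I."""
            U, _, Vt = np.linalg.svd(B.T @ A)   # closed-form solution via the SVD of B'A
            return U @ Vt

        # tiny check: recover a known rotation from random data
        rng = np.random.default_rng(0)
        B = rng.standard_normal((50, 4))
        Q_true, _ = np.linalg.qr(rng.standard_normal((4, 4)))
        A = B @ Q_true
        print(np.allclose(B @ procrustes_rotation(A, B), A))   # True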

    Recipes for sparse LDA of horizontal data

    Many important modern applications require analyzing data with more variables than observations, called horizontal data for short. In this situation, classical Fisher’s linear discriminant analysis (LDA) has no solution because the within-group scatter matrix is singular. Moreover, the number of variables is usually huge, and the classical solutions (discriminant functions) are difficult to interpret because they involve all available variables. Nowadays, the aim is to develop fast and reliable algorithms for sparse LDA of horizontal data. The resulting discriminant functions depend on very few of the original variables, which facilitates their interpretation. The main theoretical and numerical challenge is how to cope with the singularity of the within-group scatter matrix. This work aims to classify the existing approaches according to the way they tackle this singularity issue, and to suggest new ones.
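
    As a toy illustration of one such recipe (regularise the singular within-group scatter, then sparsify the discriminant vector), the two-class sketch below ridge-regularises the within-group scatter and soft-thresholds the resulting Fisher direction; the ridge and threshold levels are arbitrary, and this is not any specific method from the survey.

        import numpy as np

        def sparse_lda_direction(X, y, ridge=1.0, threshold=0.1):
            """Toy two-class sparse LDA for horizontal (p >> n) data."""
            X0, X1 = X[y == 0], X[y == 1]
            m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
            Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)   # singular when p > n
            p = X.shape[1]
            w = np.linalg.solve(Sw + ridge * np.eye(p), m1 - m0)     # ridge-regularised direction
            w = np.sign(w) * np.maximum(np.abs(w) - threshold * np.abs(w).max(), 0.0)  # sparsify
            return w   # classify a new x by the sign of (x - (m0 + m1) / 2) @ w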

    Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

    Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains, but it remains a difficult task in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than the dimension of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained, which can accommodate various situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets show that the proposed approach performs better than existing clustering methods while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.
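
    As a crude illustration of the alternation behind this idea (and not the DLM model or the actual Fisher-EM algorithm), the sketch below alternates between fitting a Gaussian mixture in a low-dimensional subspace and re-estimating a Fisher-style discriminative subspace from the current cluster labels, using off-the-shelf scikit-learn components; the PCA initialisation and the fixed number of iterations are my assumptions.

        import numpy as np
        from sklearn.decomposition import PCA
        from sklearn.mixture import GaussianMixture
        from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

        def cluster_in_discriminative_subspace(X, k, n_iter=10, seed=0):
            d = k - 1                                   # Fisher subspace has at most k - 1 dimensions
            Z = PCA(n_components=d).fit_transform(X)    # rough initial subspace
            labels = GaussianMixture(k, random_state=seed).fit_predict(Z)
            for _ in range(n_iter):
                lda = LinearDiscriminantAnalysis(n_components=d).fit(X, labels)
                Z = lda.transform(X)                    # discriminative subspace for current labels
                labels = GaussianMixture(k, random_state=seed).fit_predict(Z)
            return labels, lda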

    Sparse PCA for compositional data

    A great number of procedures for sparse principal component analysis (PCA) have been proposed over the last decade. However, they cannot be applied directly to PCA of compositional data (CoDa). We introduce a new procedure for sparse PCA which takes into account the additional constraints specific to CoDa. The proposed method is very effective at finding log-contrasts in the data, as illustrated on a real example.
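
    For context, here is a minimal sketch of the naive baseline such a procedure improves on: clr-transform the compositions and run off-the-shelf sparse PCA. Ordinary sparse PCA does not force the loadings of each component to sum to zero, so they are not genuine log-contrasts; enforcing that CoDa-specific constraint is exactly the gap the proposed method addresses. The synthetic data and penalty value below are placeholders.

        import numpy as np
        from sklearn.decomposition import SparsePCA

        def clr(X):
            """Centred log-ratio transform of strictly positive compositions."""
            L = np.log(X)
            return L - L.mean(axis=1, keepdims=True)

        rng = np.random.default_rng(0)
        comps = rng.dirichlet(np.ones(10), size=200)          # 200 compositions with 10 parts
        spca = SparsePCA(n_components=3, alpha=1.0, random_state=0).fit(clr(comps))
        print(spca.components_)   # sparse loadings; rows need not sum to zero (not true log-contrasts)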